Silesia Province
Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution
Qin, Tianrui, Chen, Qianben, Wang, Sinuo, Xing, He, Zhu, King, Zhu, He, Shi, Dingfeng, Liu, Xinxin, Zhang, Ge, Liu, Jiaheng, Jiang, Yuchen Eleanor, Gao, Xitong, Zhou, Wangchunshu
Large language models (LLMs) have demonstrated remarkable capabilities in complex reasoning tasks when equipped with external tools. However, current frameworks predominantly rely on sequential processing, leading to inefficient execution, particularly for tasks requiring extensive tool interaction. This paper introduces Flash-Searcher, a novel parallel agent reasoning framework that fundamentally reimagines the execution paradigm from sequential chains to directed acyclic graphs (DAGs). Flash-Searcher decomposes complex tasks into subtasks with explicit dependencies, enabling concurrent execution of independent reasoning paths while maintaining logical constraints. Through dynamic workflow optimization, our framework continuously refines the execution graph based on intermediate results, effectively integrating a summary module. Comprehensive evaluations across multiple benchmarks demonstrate that Flash-Searcher consistently outperforms existing approaches. Specifically, it achieves 67.7% accuracy on BrowseComp and 83% on xbench-DeepSearch, while reducing agent execution steps by up to 35% compared to current frameworks. Furthermore, when distilling this parallel reasoning pipeline into single models, we observe substantial performance gains across diverse backbone architectures, underscoring the generalizability of our methodology. Our work thus represents a significant advance in agent architecture design, offering a more scalable and efficient paradigm for complex reasoning tasks.
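As a rough illustration of the DAG-based execution the abstract describes, the sketch below runs subtasks in dependency order and executes each set of ready subtasks concurrently. It is a minimal sketch, not the Flash-Searcher implementation: `run_dag`, the subtask names, and the thread-pool scheduling are assumptions made for this example, and the dynamic graph refinement and summary module are left out.

```python
from concurrent.futures import ThreadPoolExecutor


def run_dag(tasks, deps):
    """Run subtasks in dependency order; each wave of ready subtasks runs concurrently.

    tasks: dict mapping a name to a callable that takes {dependency: result}
    deps:  dict mapping a name to the set of names it depends on
    Returns (results, waves), where each wave lists the names run together.
    """
    remaining = {name: set(deps.get(name, ())) for name in tasks}
    results, waves = {}, []
    with ThreadPoolExecutor() as pool:
        while remaining:
            # A subtask is ready once all of its dependencies have results.
            ready = sorted(n for n, d in remaining.items() if d.issubset(results))
            if not ready:
                raise ValueError("cycle or unknown dependency in the graph")
            futures = {
                n: pool.submit(tasks[n], {d: results[d] for d in remaining[n]})
                for n in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                del remaining[name]
            waves.append(ready)
    return results, waves


# Hypothetical plan: two independent lookups feed a comparison, then an answer.
tasks = {
    "lookup_a": lambda inputs: 3,
    "lookup_b": lambda inputs: 5,
    "compare": lambda inputs: max(inputs["lookup_a"], inputs["lookup_b"]),
    "answer": lambda inputs: f"largest value is {inputs['compare']}",
}
deps = {"compare": {"lookup_a", "lookup_b"}, "answer": {"compare"}}

results, waves = run_dag(tasks, deps)
print(waves)              # three waves instead of four sequential steps
print(results["answer"])
```

In this toy plan the two lookups share a wave, so the four subtasks finish in three scheduling rounds; that is the kind of step reduction a dependency graph makes possible when subtasks are independent.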
- Asia > Russia (0.45)
- Europe > Russia (0.27)
- South America > Brazil (0.14)
- Workflow (1.00)
- Research Report > New Finding (0.67)
- Leisure & Entertainment (0.93)
- Media > Music (0.68)
- Government (0.67)
Rank, Chunk and Expand: Lineage-Oriented Reasoning for Taxonomy Expansion
Mishra, Sahil, Arjun, Kumar, Chakraborty, Tanmoy
Taxonomies are hierarchical knowledge graphs crucial for recommendation systems and web applications. As data grows, expanding taxonomies is essential, but existing methods face key challenges: (1) discriminative models struggle with representation limits and generalization, while (2) generative methods either process all candidates at once, introducing noise and exceeding context limits, or discard relevant entities by selecting noisy candidates. We propose LORex (Lineage-Oriented Reasoning for Taxonomy Expansion), a plug-and-play framework that combines discriminative ranking and generative reasoning for efficient taxonomy expansion. Unlike prior methods, LORex ranks and chunks candidate terms into batches, filtering noise and iteratively refining selections by reasoning over the candidates' hierarchy to ensure contextual efficiency. Extensive experiments across four benchmarks and twelve baselines show that LORex improves accuracy by 12% and Wu & Palmer similarity by 5% over state-of-the-art methods.
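For readers unfamiliar with the Wu & Palmer similarity reported above, the sketch below computes it on a tiny hand-made taxonomy as 2 * depth(LCA) / (depth(a) + depth(b)), where LCA is the deepest common ancestor. The taxonomy, the function names, and the convention that the root has depth 1 are assumptions for illustration; the abstract does not say how LORex's evaluation computes the metric.

```python
def path_to_root(node, parent):
    """Return [node, parent, ..., root] by following the child -> parent map."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def depth(node, parent):
    """Depth counted in nodes, so the root has depth 1 (assumed convention)."""
    return len(path_to_root(node, parent))


def wu_palmer(a, b, parent):
    """Wu & Palmer similarity of two nodes in the same tree."""
    ancestors_b = set(path_to_root(b, parent))
    # The first ancestor of a that is also an ancestor of b is the deepest one.
    lca = next(n for n in path_to_root(a, parent) if n in ancestors_b)
    return 2 * depth(lca, parent) / (depth(a, parent) + depth(b, parent))


# Hypothetical toy taxonomy, stored as child -> parent.
parent = {"animal": "entity", "plant": "entity", "dog": "animal", "cat": "animal"}

print(wu_palmer("dog", "cat", parent))    # siblings under "animal"
print(wu_palmer("dog", "plant", parent))  # related only through the root
```

Siblings score higher than terms that meet only at the root, which is why the metric rewards predictions that land near the correct parent even when they are not exact.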
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Arctic Ocean (0.06)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Acer just announced two new gaming laptops with great specs and sleek designs
Acer just revealed two new gaming laptops at IEM Katowice, a Counter-Strike tournament in Poland. These are entries in the company's Predator Helios Neo AI line of laptops, so they are filled to the brim with both bells and whistles. The Helios Neo 16 AI and 18 AI can be outfitted with up to an Intel Core Ultra 9 275HX processor and the NVIDIA GeForce RTX 5070 Ti Laptop GPU. Both of these computers also boast sleek, minimalist designs, with RGB logos on the lid and "dynamic 4-zone" RGB keyboards. They support up to 64GB of RAM and up to 2TB of internal storage.
- Europe > Poland > Silesia Province > Katowice (0.28)
- North America > United States (0.08)
- Information Technology > Hardware (1.00)
- Information Technology > Artificial Intelligence (1.00)
Enhancing Coronary Artery Calcium Scoring via Multi-Organ Segmentation on Non-Contrast Cardiac Computed Tomography
Nalepa, Jakub, Bartczak, Tomasz, Bujny, Mariusz, Gośliński, Jarosław, Jesionek, Katarzyna, Malara, Wojciech, Malawski, Filip, Miszalski-Jamka, Karol, Rewa, Patrycja, Kostur, Marcin
Despite coronary artery calcium scoring being considered a largely solved problem within the realm of medical artificial intelligence, this paper argues that significant improvements can still be made. By shifting the focus from pathology detection to a deeper understanding of anatomy, the novel algorithm proposed in the paper both achieves high accuracy in coronary artery calcium scoring and offers enhanced interpretability of the results. This approach not only aids in the precise quantification of calcifications in coronary arteries, but also provides valuable insights into the underlying anatomical structures. Through this anatomically-informed methodology, the paper shows how a nuanced understanding of the heart's anatomy can lead to more accurate and interpretable results in the field of cardiovascular health. We demonstrate the superior accuracy of the proposed method by evaluating it on an open-source multi-vendor dataset, where we obtain results at the inter-observer level, surpassing the current state of the art. Finally, the qualitative analyses show the practical value of the algorithm in such tasks as labeling coronary artery calcifications, identifying aortic calcifications, and filtering out false positive detections due to noise.
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.46)
Carefully Structured Compression: Efficiently Managing StarCraft II Data
Ferenczi, Bryce, Newbury, Rhys, Burke, Michael, Drummond, Tom
Creation and storage of datasets are often overlooked input costs in machine learning, as many datasets are simple image-label pairs or plain text. However, datasets with more complex structures, such as those from the real-time strategy game StarCraft II, require more deliberate thought and strategy to reduce cost of ownership. We introduce a serialization framework for StarCraft II that reduces the cost of dataset creation and storage, as well as improving usage ergonomics. We benchmark against the most comparable existing dataset from AlphaStar-Unplugged and highlight the benefit of our framework in terms of both the cost of creation and storage. We use our dataset to train deep learning models that exceed the performance of comparable models trained on other datasets. The dataset conversion and usage framework introduced is open source and can be used as a framework for datasets with similar characteristics such as digital twin simulations. Pre-converted StarCraft II tournament data is also available online.
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > Poland > Silesia Province > Katowice (0.04)
- North America > United States > New York > New York County > New York City (0.04)
Measurements with Noise: Bayesian Optimization for Co-optimizing Noise and Property Discovery in Automated Experiments
Slautin, Boris N., Liu, Yu, Dec, Jan, Shvartsman, Vladimir V., Lupascu, Doru C., Ziatdinov, Maxim, Kalinin, Sergei V.
We have developed a Bayesian optimization (BO) workflow that integrates intra-step noise optimization into automated experimental cycles. Traditional BO approaches in automated experiments focus on optimizing experimental trajectories but often overlook the impact of measurement noise on data quality and cost. Our proposed framework simultaneously optimizes both the target property and the associated measurement noise by introducing time as an additional input parameter, thereby balancing the signal-to-noise ratio and experimental duration. Two approaches are explored: a reward-driven noise optimization and a double-optimization acquisition function, both enhancing the efficiency of automated workflows by considering noise and cost within the optimization process. We validate our method through simulations and real-world experiments using Piezoresponse Force Microscopy (PFM), demonstrating the successful optimization of measurement duration and property exploration. Our approach offers a scalable solution for optimizing multiple variables in automated experimental workflows, improving data quality, and reducing resource expenditure in materials science and beyond.
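The trade-off between measurement noise and experimental duration can be illustrated with a toy cost function. The sketch below is not the authors' acquisition function or reward: it assumes, purely for illustration, that noise falls as `sigma0 / sqrt(t)` when a reading is averaged for time `t`, and that time is charged linearly at a hypothetical rate `time_cost`. It then picks the duration that minimizes the combined cost, by grid search and by the closed form for this toy model.

```python
import math


def measurement_cost(t, sigma0, time_cost):
    """Toy objective: assumed noise level sigma0 / sqrt(t) plus a linear time charge."""
    return sigma0 / math.sqrt(t) + time_cost * t


def best_duration_grid(sigma0, time_cost, candidates):
    """Pick the candidate duration with the lowest toy cost."""
    return min(candidates, key=lambda t: measurement_cost(t, sigma0, time_cost))


def best_duration_closed_form(sigma0, time_cost):
    """Setting the derivative of the toy cost to zero gives t = (sigma0 / (2 * time_cost)) ** (2 / 3)."""
    return (sigma0 / (2 * time_cost)) ** (2 / 3)


sigma0, time_cost = 1.0, 0.5                      # hypothetical values
candidates = [i / 10 for i in range(1, 51)]       # durations 0.1 .. 5.0

best_t = best_duration_grid(sigma0, time_cost, candidates)
print(best_t, best_duration_closed_form(sigma0, time_cost))
```

In a workflow like the one described, the duration would instead be treated as an extra input dimension of the surrogate model, so the optimizer learns this trade-off from data rather than from a fixed formula.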
- North America > United States > Tennessee > Knox County > Knoxville (0.14)
- North America > United States > Washington > Benton County > Richland (0.04)
- Europe > Poland > Silesia Province > Katowice (0.04)
- Europe > Germany (0.04)
- Workflow (1.00)
- Research Report (1.00)
- Energy (1.00)
- Government > Regional Government (0.46)
Reasoning Factual Knowledge in Structured Data with Large Language Models
Huang, Sirui, Gu, Yanggan, Hu, Xuming, Li, Zhonghao, Li, Qing, Xu, Guandong
Large language models (LLMs) have made remarkable progress in various natural language processing tasks as a benefit of their capability to comprehend and reason with factual knowledge. However, a significant amount of factual knowledge is stored in structured data, which possesses unique characteristics that differ from the unstructured texts used for pretraining. This difference can introduce imperceptible inference parameter deviations, posing challenges for LLMs in effectively utilizing and reasoning with structured data to accurately infer factual knowledge. To this end, we propose a benchmark named StructFact, to evaluate the structural reasoning capabilities of LLMs in inferring factual knowledge. StructFact comprises 8,340 factual questions encompassing various tasks, domains, timelines, and regions. This benchmark allows us to investigate the capability of LLMs across five factual tasks derived from the unique characteristics of structural facts. Extensive experiments on a set of LLMs with different training strategies reveal the limitations of current LLMs in inferring factual knowledge from structured data. We present this benchmark as a compass to navigate the strengths and weaknesses of LLMs in reasoning with structured data for knowledge-sensitive tasks, and to encourage advancements in related real-world applications. Please find our code at https://github.com/EganGu/StructFact.
- Europe > Poland > Masovia Province > Warsaw (0.04)
- Asia > China > Xinjiang Uygur Autonomous Region (0.04)
- Asia > China > Hong Kong (0.04)
- Leisure & Entertainment > Sports (1.00)
- Information Technology (0.92)
- Health & Medicine (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Towards consistency of rule-based explainer and black box model -- fusion of rule induction and XAI-based feature importance
Kozielski, Michał, Sikora, Marek, Wawrowski, Łukasz
Rule-based models offer a human-understandable representation, i.e. they are interpretable. For this reason, they are used to explain the decisions of non-interpretable complex models, referred to as black box models. The generation of such explanations involves the approximation of a black box model by a rule-based model. To date, however, it has not been investigated whether the rule-based model makes decisions in the same way as the black box model it approximates. Decision making in the same way is understood in this work as the consistency of decisions and the consistency of the most important attributes used for decision making. This study proposes a novel approach ensuring that the rule-based surrogate model mimics the performance of the black box model. The proposed solution performs an explanation fusion involving rule generation and taking into account the feature importance determined by the selected XAI methods for the black box model being explained. The result of the method can be both global and local rule-based explanations. The quality of the proposed solution was verified by extensive analysis on 30 tabular benchmark datasets representing classification problems. Evaluation included comparison with the reference method and an illustrative case study. In addition, the paper discusses the possible pathways for the application of the rule-based approach in XAI and how rule-based explanations, including the proposed method, meet the user perspective and requirements for both content and presentation. The software created and a detailed report containing the full experimental results are available on the GitHub repository (https://github.com/ruleminer/FI-rules4XAI).
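The two notions of consistency named in the abstract can be made concrete with a small sketch. The measures below are assumptions chosen for illustration, not the paper's definitions: decision consistency is taken as the share of examples on which surrogate and black box agree, and attribute consistency as the Jaccard overlap of each model's top-k features. All names and numbers are hypothetical.

```python
def decision_consistency(black_box_preds, surrogate_preds):
    """Share of examples on which the two models predict the same class."""
    if len(black_box_preds) != len(surrogate_preds):
        raise ValueError("prediction lists must have the same length")
    matches = sum(a == b for a, b in zip(black_box_preds, surrogate_preds))
    return matches / len(black_box_preds)


def top_k_features(importance, k):
    """Names of the k features with the highest importance scores."""
    return set(sorted(importance, key=importance.get, reverse=True)[:k])


def attribute_consistency(importance_a, importance_b, k):
    """Jaccard overlap of the two models' top-k feature sets (assumed measure)."""
    top_a, top_b = top_k_features(importance_a, k), top_k_features(importance_b, k)
    return len(top_a & top_b) / len(top_a | top_b)


# Hypothetical predictions and feature importances.
black_box_preds = [1, 0, 1, 1, 0]
surrogate_preds = [1, 0, 0, 1, 0]
black_box_importance = {"age": 0.5, "income": 0.3, "height": 0.1, "zip": 0.1}
surrogate_importance = {"age": 0.6, "income": 0.1, "height": 0.25, "zip": 0.05}

print(decision_consistency(black_box_preds, surrogate_preds))
print(attribute_consistency(black_box_importance, surrogate_importance, k=2))
```

Here the surrogate agrees with the black box on most decisions yet relies on a partly different set of attributes, which is the mismatch the study sets out to reduce.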
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
- Europe > Poland > Silesia Province > Katowice (0.04)
- Overview (1.00)
- Research Report > Promising Solution (0.66)
- Transportation > Air (1.00)
- Health & Medicine (1.00)
From Local Concepts to Universals: Evaluating the Multicultural Understanding of Vision-Language Models
Bhatia, Mehar, Ravi, Sahithya, Chinchure, Aditya, Hwang, Eunjeong, Shwartz, Vered
Despite recent advancements in vision-language models, their performance remains suboptimal on images from non-western cultures due to underrepresentation in training datasets. Various benchmarks have been proposed to test models' cultural inclusivity, but they have limited coverage of cultures and do not adequately assess cultural diversity across universal as well as culture-specific local concepts. To address these limitations, we introduce the GlobalRG benchmark, comprising two challenging tasks: retrieval across universals and cultural visual grounding. The former task entails retrieving culturally diverse images for universal concepts from 50 countries, while the latter aims at grounding culture-specific concepts within images from 15 countries. Our evaluation across a wide range of models reveals that the performance varies significantly across cultures -- underscoring the necessity for enhancing multicultural understanding in vision-language models.
- Asia > East Asia (0.20)
- Asia > Southeast Asia (0.15)
- North America > Central America (0.14)
Dynamical mixture modeling with fast, automatic determination of Markov chains
Miles, Christopher E., Webber, Robert J.
Markov state modeling has gained popularity in various scientific fields due to its ability to reduce complex time series data into transitions between a few states. Yet, current frameworks are limited by assuming a single Markov chain describes the data, and they cannot discern heterogeneities. As a solution, this paper proposes a variational expectation-maximization algorithm that identifies a mixture of Markov chains in a time-series data set. The method is agnostic to the definition of the Markov states, whether data-driven (e.g. by spectral clustering) or based on domain knowledge. Variational EM efficiently and organically identifies the number of Markov chains and dynamics of each chain without expensive model comparisons or posterior sampling. The approach is supported by a theoretical analysis and numerical experiments, including simulated and observational data sets based on Last.fm music listening, ultramarathon running, and gene expression. The results show the new algorithm is competitive with contemporary mixture modeling approaches and powerful in identifying meaningful heterogeneities in time series data.
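To make the idea of a mixture of Markov chains concrete, the sketch below computes how strongly one observed state sequence is attributed to each chain in a fixed mixture. This is only the assignment step of ordinary EM, under the assumptions that the transition matrices and mixture weights are given, all transition probabilities are strictly positive, and the initial-state term is shared across chains and so cancels. It is not the paper's variational algorithm, which also determines the number of chains; the matrices and names here are hypothetical.

```python
import math


def sequence_log_likelihood(seq, transition):
    """Sum of log transition probabilities along the sequence (initial state ignored)."""
    return sum(math.log(transition[a][b]) for a, b in zip(seq, seq[1:]))


def responsibilities(seq, weights, transitions):
    """Posterior probability of each chain having generated the sequence."""
    logs = [
        math.log(w) + sequence_log_likelihood(seq, t)
        for w, t in zip(weights, transitions)
    ]
    top = max(logs)  # subtract the maximum before exponentiating, for stability
    unnormalized = [math.exp(value - top) for value in logs]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two hypothetical two-state chains: one tends to stay put, one tends to switch.
sticky = [[0.9, 0.1], [0.1, 0.9]]
switching = [[0.1, 0.9], [0.9, 0.1]]

resp = responsibilities([0, 0, 0, 0, 1, 1, 1], [0.5, 0.5], [sticky, switching])
print(resp)  # the sequence rarely changes state, so the sticky chain dominates
```

A full fitting procedure would alternate this step with re-estimating each chain's transition matrix from the sequences weighted by these responsibilities.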
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > China > Jiangsu Province > Nanjing (0.04)
- Oceania > New Zealand (0.04)